Adaptive Checkpointing Schemes for Fault Tolerance in Real-Time Systems with Task Duplication
Authors
Abstract
This paper studies dynamic adaptation techniques based on checkpointing. By placing store-checkpoints and compare-checkpoints between CSCPs (store-and-compare checkpoints), we first present adaptive checkpointing schemes in which the checkpointing interval for a task is dynamically adjusted online. Taking the overheads of comparison and storage into account, we obtain the average execution time to complete a task under each proposed scheme using renewal equations. We then discuss analytically the optimal numbers of checkpoints that minimize the average execution times, and extend the proposed schemes to a set of multiple tasks in real-time systems. Simulation results show that, compared to the previous method, the proposed approach significantly increases the likelihood of timely task completion.
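The trade-off behind an optimal number of checkpoints can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not the schemes analyzed in the paper. It assumes a task of length `T` runs on two replicas, is split into `n` equal intervals that each end in a store-and-compare checkpoint with a combined overhead `overhead`, faults arrive on each replica as a Poisson process with rate `lam`, and a mismatch at the comparison forces the interval to be repeated. All names and parameters are hypothetical.

```python
import math


def expected_time(T, n, lam, overhead):
    """Mean completion time for n equal intervals, each ending in a CSCP.

    Assumed model (not taken from the paper): two replicas, Poisson faults
    with rate lam per replica, faults detected only at the comparison.
    """
    s = T / n + overhead            # one interval including checkpoint overhead
    p_ok = math.exp(-2 * lam * s)   # both replicas stay fault-free for s
    # Renewal argument per interval: E = s + (1 - p_ok) * E, hence E = s / p_ok.
    return n * s / p_ok


def best_checkpoint_count(T, lam, overhead, n_max=1000):
    """Brute-force search for the n in 1..n_max with the smallest mean time."""
    return min(range(1, n_max + 1),
               key=lambda n: expected_time(T, n, lam, overhead))
```

With `lam = 0` the mean time reduces to `T + n * overhead`, so a single interval is best. As the fault rate grows, shorter intervals pay off despite the extra overhead.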
Similar resources
Analysis of checkpointing for schedulability of real-time systems
Checkpointing is a relatively cost effective method for achieving fault tolerance in real-time systems. Since checkpointing schemes depend on time redundancy, they could affect the correctness of the system by causing deadlines to be missed. This paper provides exact schedulability tests for fault tolerant task sets under specified failure hypothesis and employing checkpointing to assist in fau...
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...
Analysis of Checkpointing Schemes for Multiprocessor Systems
Parallel computing systems provide hardware redundancy that helps to achieve low cost fault-tolerance, by duplicating the task into more than a single processor, and comparing the states of the processors at checkpoints. This paper suggests a novel technique, based on a Markov Reward Model (MRM), for analyzing the performance of checkpointing schemes with task duplication. We show how thi...
Fault Recovery Based on Checkpointing for Hard Real-Time Embedded Systems
Safety-critical embedded systems often operate in harsh environmental conditions that necessitate fault-tolerant computing techniques. Many safety-critical systems also execute real-time applications. The correctness of these systems depends not only on the logical result of computation, but also on the time at which the results are produced. The missing of task deadlines can therefore be viewed...
Adaptive Checkpointing
Checkpointing is a typical approach to tolerate failures in today’s supercomputing clusters and computational grids. Checkpoint data can be saved either in central stable storage, or in processor memory (as in diskless checkpointing), or local disk space (replacing memory with local disk in diskless checkpointing). But where to save the checkpoint data has a great impact on the performance of a...
Journal title:
Volume, issue
Pages -
Publication date: 2006